Rank Cover Trees for Nearest Neighbor Search

Authors

  • Michael E. Houle
  • Michael Nett
Abstract

Virtually all known distance-based similarity search indexes make use of some form of numerical constraints (triangle inequality, additive distance bounds, ...) on similarity values for pruning and selection. The use of such numerical constraints, however, often leads to large variations in the numbers of objects examined in the execution of a query, making it difficult to control the execution costs. We introduce a probabilistic data structure for similarity search, the Rank Cover Tree (RCT), that entirely avoids the use of numerical constraints. All internal selections are made according to the ranks of the objects with respect to the query, allowing much tighter control on the overall execution costs. A rank-based probabilistic analysis shows that with very high probability, the RCT returns a correct query result in time that depends competitively on a measure of the intrinsic dimensionality of the data set.
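
The contrast between rank-based selection and numerical pruning can be illustrated with a minimal sketch. This is not the authors' RCT construction or analysis: it is a simplified hierarchy of nested random samples, and the names `build_levels`, `rank_query`, `rate`, and `coverage` are hypothetical. At each level the search keeps a fixed number of candidates chosen purely by their rank with respect to the query, so the number of objects examined per level is bounded in advance rather than depending on distance thresholds.

```python
import random
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_levels(points, rate=2, seed=0):
    """Build nested random samples and parent->children links.

    levels[0] holds every point index; each higher level is a random
    subset of the one below it (an illustrative assumption).
    """
    rng = random.Random(seed)
    levels = [list(range(len(points)))]
    while len(levels[-1]) > rate:
        prev = levels[-1]
        levels.append(sorted(rng.sample(prev, max(1, len(prev) // rate))))
    children = []
    for lower, upper in zip(levels, levels[1:]):
        kids = defaultdict(list)
        for p in lower:
            # Attach each point to its nearest member of the level above.
            parent = min(upper, key=lambda u: dist(points[p], points[u]))
            kids[parent].append(p)
        children.append(kids)
    return levels, children


def rank_query(points, levels, children, q, k, coverage=4):
    """Descend the hierarchy, keeping only the top-ranked candidates.

    No triangle inequality or distance bound is used: at each level the
    `k * coverage` candidates closest to q (by rank) are retained.
    """
    budget = k * coverage
    candidates = levels[-1]
    for kids in reversed(children):
        kept = sorted(candidates, key=lambda i: dist(points[i], q))[:budget]
        candidates = [c for p in kept for c in kids[p]]
    return sorted(candidates, key=lambda i: dist(points[i], q))[:k]


if __name__ == "__main__":
    pts = [(float(i),) for i in range(20)]
    levels, children = build_levels(pts)
    print(rank_query(pts, levels, children, (7.2,), k=3, coverage=20))
```

In this sketch a larger `coverage` trades extra work for a higher chance of an exact answer; when the budget is at least as large as every level, no candidate is ever dropped and the result is exact.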

Similar resources

Search Space Reductions for Nearest-Neighbor Queries

The vast number of applications featuring multimedia and geometric data has made the R-tree a ubiquitous data structure in databases. A popular and fundamental operation on R-trees is nearest neighbor search. While nearest neighbor on R-trees has received considerable experimental attention, it has received somewhat less theoretical consideration. We study pruning heuristics for nearest neighbo...

Evaluation Accuracy of Nearest Neighbor Sampling Method in Zagross Forests

Collecting appropriate qualitative and quantitative data is necessary for proper management and planning. Using suitable inventory methods is necessary, and the accuracy of sampling methods depends on the inventory net and the number of sample points. The nearest neighbor sampling method is one of the distance methods and is calculated by three equations (Byth and Riple, 1980; Cotam and Curtis, 1956 and Cota...

Metrized Small World Approach for Nearest Neighbor Search

In various fields, attempts are made to organize data into multi-linked structures that are well suited for information search, in particular nearest neighbor search, where the result data items are metrically close to a given data item. These structures often take the form of trees (M-Tree, cover tree, KD-tree, GNAT) or networks (M-Chord, VoroNet, RayNet) built over a set of data items. In ...

Which Space Partitioning Tree to Use for Search - Summary

Trees like binary-space-partitioning trees, kd-trees, principal axis trees and random projection trees are used to answer the question "which tree to use for nearest-neighbor search?" This paper deals with the influence of the vector quantization performance of the trees on the search performance and the margins of the partitions in these trees. Theoretical results show that both factors have ...

Publication date: 2013